A comparison and combination of segmental and fixed-frame signal representations in NMF-based word recognition
نویسندگان
چکیده
Segmental and fixed-frame signal representations were compared in different noise conditions in a weakly supervised word recognition task using a non-negative matrix factorization (NMF) framework. The experiments show that fixed-frame windowing results in better recognition rates with clean signals. When noise is introduced to the system, robustness of segmental signal representations becomes useful, decreasing the overall word error rate. It is shown that a combination of fixed-frame and segmental representations yields the best recognition rates in different noise conditions. An entropy based method for dynamically adjusting the weight between representations is also introduced, leading to near-optimal weighting and therefore enhanced recognition rates in varying SNR conditions.
منابع مشابه
Iterative Weighted Non-smooth Non-negative Matrix Factorization for Face Recognition
Non-negative Matrix Factorization (NMF) is a part-based image representation method. It comes from the intuitive idea that entire face image can be constructed by combining several parts. In this paper, we propose a framework for face recognition by finding localized, part-based representations, denoted “Iterative weighted non-smooth non-negative matrix factorization” (IWNS-NMF). A new cost fun...
متن کاملبهبود عملکرد سیستم بازشناسی گفتار پیوسته بوسیله ویژگیهای استخراج شده از مانیفولدهای گفتاری در فضای بازسازی شده فاز
The design for new feature extraction methods out of the speech signal and combination of their obtained information is one of the most effective approaches to improve the performance of automatic speech recognition (ASR) system. Recent researches have been shown that the speech signal contains nonlinear and chaotic properties, but the effects of these properties are not used in the continuous ...
متن کاملAIOSC: Analytical Integer Word-length Optimization based on System Characteristics for Recursive Fixed-point LTI Systems
The integer word-length optimization known as range analysis (RA) of the fixed-point designs is a challenging problem in high level synthesis and optimization of linear-time-invariant (LTI) systems. The analysis has significant effects on the resource usage, accuracy and efficiency of the final implementation, as well as the optimization time. Conventional methods in recursive LTI systems suffe...
متن کاملPersian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
متن کاملImproving LNMF Performance of Facial Expression Recognition via Significant Parts Extraction using Shapley Value
Nonnegative Matrix Factorization (NMF) algorithms have been utilized in a wide range of real applications. NMF is done by several researchers to its part based representation property especially in the facial expression recognition problem. It decomposes a face image into its essential parts (e.g. nose, lips, etc.) but in all previous attempts, it is neglected that all features achieved by NMF ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2009